Camera coupled reference frame

ABSTRACT

Techniques are provided for managing long-term reference frames (LTRFs) for two or more video sources. A first video source is selected from a plurality of video sources. The first video source is encoded to produce an encoded video stream, where a reference frame message identifies a recent video frame as a long-term reference frame (LTRF) associated with the first video source. The process is repeated for other video sources. The LTRF associated with the first video source is used as a reference for temporal predictive coding upon receiving a signal that the first video source has been re-selected.

TECHNICAL FIELD

The present disclosure relates to video encoding and particularly to techniques for managing long-term reference frames for a plurality of video streams.

BACKGROUND

The Moving Picture Experts Group (MPEG) 4 advanced video coding (AVC) standard specifies the use of long-term reference frames (LTRFs). LTRFs are designed to have a greater persistence than traditional MPEG-2 I-frames and can be a designated B-frame, P-frame, or any other temporally predictive INTER frame requiring less bandwidth than an I-frame (or other fully independently coded frame). I-frames usually persist for the duration of a group of pictures (GOP) and can also be used as a reference frame for error correction when errors occur over the transmission network. On the other hand, LTRFs can persist indefinitely and can also be used as a reference frame when errors occur over the network. The MPEG-4 standard defines a number of slots or placeholders in memory for LTRFs. LTRFs are generated by the encoder and identified in the encoded video stream. Both the encoder and decoder store the LTRFs.

When using LTRFs, error feedback from the decoder is desired because any error in a frame (LTRF or not) will continue to propagate over time, i.e., future frames will continue to reference (or motion predict from) bad data in another decoded frame. The decoder can inform the encoder of errors such as a lost or damaged packet. An encoder receiving such feedback can correct the error, e.g., by encoding the next frame in such a way that it does not reference the erroneous data. One method to correct errors is to reference an LTRF that has been confirmed by the decoder not to contain errors and thereby avoid the need to send a higher bandwidth I-frame. In this case, the encoder needs to indicate to the decoder which LTRF is the reference frame for predictive coding. Error feedback also ensures that the LTRFs stored at the encoder and decoder are the same.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of camera coupled LTRFs of the present disclosure will become apparent upon consideration of the following description of example embodiments thereof, particularly when taken in conjunction with the accompanying drawings wherein like reference numerals in the various figures are utilized to designate like components.

FIG. 1 illustrates an example of a block diagram of a system with an encoder employing an LTRF management process in accordance with an embodiment of the present invention.

FIG. 2 illustrates an example of a block diagram of an encoder employing an LTRF management process in accordance with an embodiment of the present invention.

FIG. 3 illustrates an example of a flow chart of the LTRF management process in accordance with an embodiment of the present invention.

DESCRIPTION OF EXAMPLE EMBODIMENTS

Overview

Techniques are provided for managing LTRFs for two or more video streams. A first signal is received indicating a first video stream is selected from a plurality of video streams. The first video stream is encoded to produce an encoded video stream, where a reference frame message identifies a video frame as a long-term reference frame (LTRF) associated with the first video stream. A second signal is received indicating a second video stream is selected from among the plurality of video streams. The second video stream is encoded and forms a continuation of the encoded video stream. A third signal is received indicating that the first video stream is re-selected, and on receipt of the third signal the LTRF is used as a reference for temporal predictive coding.

Techniques are also provided for a decoder to decode the encoded video stream and store the LTRFs identified in the reference frame message. If the reference frame message is not received or if the LTRF does not decode properly, then the decoder can send feedback to the encoder. The encoder can use the feedback and knowledge of channel conditions to adjust the encoding scheme by using well-known error resilience or error concealment techniques. One well-known technique is Reference Picture Selection (RPS) described in Annex N of the International Telecommunication Union (ITU) H.263+ Recommendation. For a description of error resilience techniques with respect to long-term reference frames, see Thomas Wiegand, Niko Färber, Klaus Stuhlmüller, and Bernd Girod: “Error-Resilient Video Transmission Using Long-Term Memory Motion-Compensated Prediction,” IEEE Journal on Selected Areas in Communications, Vol. 18, No. 6, pp. 1050-1062, June 2000.

MPEG standards provide a high degree of compression by encoding blocks of pixels using various techniques and then using motion compensation to encode most video frames (or slices) as predictions from or between other frames. In particular, an encoded MPEG video stream is comprised of a series of GOPs (or groups of blocks), and each GOP begins with an independently encoded I-frame (INTRA-coded frame) and may include one or more following INTER-coded frames, such as P-frames (predictive-coded frames) or B-frames (bi-directionally predictive-coded frames). Each I-frame can be decoded independently and without additional information. Decoding of a P-frame requires information from a preceding frame in the GOP. Decoding of a B-frame requires information from a preceding and a following frame in the GOP. Because B-frames and P-frames are decoded using information from other frames, they require less bandwidth when transmitted.

As video streams are switched, e.g., from one camera to another, the new picture may be drastically different from the previous picture and INTER-prediction may fail, and the encoder may resort to INTRA-prediction. This can cause a temporary interruption in the video quality at the destination each time the video stream is switched. INTRA-prediction results in lower picture quality compared to INTER-prediction for any given bandwidth (or bit-rate budget).

Embodiments disclosed herein are generally described with respect to an encoder and a decoder that conform to at least some parts of the ITU H.264 Recommendation (MPEG-4 Part 10). It should be understood that embodiments described herein, however, are not restricted to H.264, and can be used for any temporally coded compressed video that allows for the use of LTRFs or similar constructs. Embodiments are also generally described with respect to whole frames. It should be understood that the techniques described herein may be applied to partial frames such as a group of blocks or slices, i.e., spatially separated portions of a frame or picture.

In certain applications, such as cable television, many programs are available for decoding. The user may switch channels or the programming may switch to an advertisement. So, in general, the encoder must constantly update the LTRFs associated with the video stream as the scenes change. However, in certain other applications, such as teleconferencing or security systems, the background scenes do not change appreciably over time. Since the backgrounds do not change, the LTRFs for these applications can persist for a greater length of time or even indefinitely. Motion compensation for a teleconference will generally be limited to facial expressions and gestures. These other applications may also incorporate more than one video stream (e.g., a camera feed), selected one at a time, into a single encoded video stream for transmission to another location. Embodiments described herein take advantage of the scene characteristics of these other applications to store LTRFs for a plurality of video streams even when the video stream is not being encoded.

Example Embodiments

Referring first to FIG. 1, a block diagram of a system configured to use an LTRF management process 300 is depicted generally at reference numeral 100. The system comprises a video switch 120, a video encoder 140, a network 160, and a video decoder 170. The video switch 120 is connected to three cameras 110(1)-110(3) and is configured to receive a selection signal 155 in order to select one camera and feed a raw video stream 130 to the video encoder 140 from the selected camera. Although three cameras are shown, any number of cameras or video streams may be used. The video switch 120 may also contain an interface (not shown) for communication with the video encoder 140 and vice versa.

The selection signal 155 is generated by a selector 150 configured to interface with sensors to detect sound (e.g., microphones 115), motion, a user input, or other detectable means such as, e.g., a change in background light intensity or a change in temperature, or combinations thereof (the remaining sensor types are not shown). In one example embodiment, the sensors for a teleconferencing application may be several microphones 115(1)-115(3) associated with the cameras 110(1)-110(3), respectively. For example, a person's voice may cause the selector 150 to select a camera feed associated with the person speaking.

In another example embodiment, the selector 150 may be coupled to motion detectors for a security system. When motion is detected, the selector 150 selects a camera that covers the area associated with the motion detector. The video switch 120 then switches to the appropriate camera. If none of the motion detectors detect motion, then the video switch 120 may periodically switch from camera to camera. The selector 150 may also detect a user input allowing the video switch 120 to select a camera feed desired by the user. Or, in another embodiment, the user input may override previously selected feeds that were selected based on sound and/or motion.
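By way of a non-limiting illustration, the selection behavior described above for a security system can be sketched as follows. The function name select_camera, its parameters, and the rule that the lowest-numbered camera with detected motion is chosen are assumptions made only for this sketch and are not part of the disclosure.

```python
from typing import Optional, Sequence


def select_camera(motion: Sequence[bool],
                  current: int,
                  user_choice: Optional[int] = None) -> int:
    """Return the index of the camera the selector would choose.

    motion[i] is True when the detector covering camera i reports motion.
    current is the index of the camera selected at present.
    user_choice, when given, overrides sound- or motion-based selection.
    """
    if user_choice is not None:
        # A user input overrides feeds selected on other criteria.
        return user_choice
    for index, detected in enumerate(motion):
        if detected:
            # Assumption: the first detector reporting motion wins.
            return index
    # No motion anywhere: step to the next camera in round-robin order.
    return (current + 1) % len(motion)
```

A caller would invoke select_camera periodically and forward the result to the video switch and to the LTRF manager as the selection signal.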

The relationships depicted in FIG. 1 between the cameras 110, the microphones 115, the video switch 120, the selector 150, and the video encoder 140 are simplified and not meant to be limiting, e.g., the video switch 120 may be separate from or part of the video encoder 140, and the sensors may be individually coupled to a camera and the selector 150. In another example, the selector 150 may be part of the video encoder 140, in which case the video encoder 140 sends a signal to the video switch 120 indicating which video stream to select.

The selection signal 155 is also sent to an LTRF manager 165 in the video encoder 140 to inform the LTRF manager 165 which camera has been selected. Once selected, the raw video stream 130 is routed to the video encoder 140. The video encoder 140, along with LTRF manager 165, employs an LTRF management process 300 (hereinafter “the process”) in accordance with an embodiment of the present invention. The video encoder 140 and the process 300 are described in more detail later in conjunction with FIGS. 2 and 3, respectively. The video encoder 140 encodes the raw video stream 130 to produce an encoded video stream 135. The encoded video stream 135 may be an MPEG-2 or MPEG-4 transport stream, or the video encoder 140 may use any other protocol suitable for transporting audio, video, and data over a network. Both the video encoder 140 and the video decoder 170 have an LTRF storage 180/190 or other storage device (not shown) for storing both short term reference frames and LTRFs generated by the process 300, and the video encoder 140 and the video decoder 170 each have a network interface card (NIC, only shown for the encoder in FIG. 2) for communication over the network 160.

The network 160 may be an Internet Protocol (IP) network, an Asynchronous Transfer Mode (ATM) network, or any other network suitable for transmitting encapsulated audio, video, or other data. It should be understood that portions of the network 160 may include optical fibers, hybrid fiber coax, coax, cable modems, servers, intermediate networking components, and the like. The communication between the video encoder 140 and the video decoder 170 is bi-directional. In general, compressed audio and video are sent from the video encoder 140 to the video decoder 170 and feedback is sent from the video decoder 170 to the video encoder 140, although other types of communication are possible. For simplicity, FIG. 1 depicts a single communication path over the network 160. It is possible for multiple communication paths and/or networks to be employed. For example, if it is anticipated that data requiring high bandwidth, like video data, will be transmitted in one direction and data requiring a lower bandwidth, like feedback, will be transmitted in the other direction, then it may be more efficient to use two different networks, one for each direction.

Turning now to FIG. 2, the video encoder 140 from FIG. 1 is shown in greater detail. The video encoder 140 contains a memory 210, a controller 220, an encoder decoder (ECD) 230, a network interface card (NIC) 240, and a motion estimation block 260. The functions of the controller 220 may be implemented by logic encoded in one or more tangible media (e.g., embedded logic such as an application specific integrated circuit (ASIC), digital signal processor (DSP) instructions, software that is executed by a processor, etc.). In addition to storing reference frames 180, the memory 210 stores data used for the methods described herein, e.g., metadata for a video source to LTRF map 250 and/or software or processor instructions that are executed to carry out the methods described herein. The memory 210 may be separate from or part of the controller 220, or a combination thereof. Thus, the process 300 may be implemented with fixed logic or programmable logic (e.g., software/computer instructions executed by a processor), and the controller 220 may be a programmable processor, programmable digital logic (e.g., field programmable gate array (FPGA)), or an ASIC that comprises fixed digital logic, or a combination thereof.

The ECD 230 is configured to encode video and audio to form an outgoing encoded video stream 135 for transmission via the NIC 240. The NIC 240 may also be configured to receive an encoded video stream, and the ECD 230 may then decode the incoming encoded video stream (not shown), such as might be the case if a video conference were being conducted. Thus, the video encoder 140 and the video decoder 170 may be identical pieces of hardware, and more than one encoder may be employed at any one location. It will be appreciated by those skilled in the art that the ECD 230 may be an on-chip hardware accelerator in the case where the controller 220 is a specialized video DSP like the Texas Instruments TMS320DM64x series, or the ECD 230 may be an off-chip device such as an FPGA.

Referring now to FIG. 3, with continued reference to FIG. 2, a process 300 for managing LTRFs is described. At 310, a first signal is received indicating a first video stream is selected from a plurality of video streams (e.g., raw video stream 130). The video stream is selected using the aforementioned or other techniques. The selection signal 155 is also sent to an LTRF manager 165 in the video encoder 140 to inform the LTRF manager 165 which video stream has been selected. The LTRF manager 165 maps the source selection signal into an LTRF selection signal.

At 320, the first video stream is encoded to produce an encoded video stream (e.g., encoded video stream 135). Within that stream, the encoder identifies a particular frame to become an LTRF. The encoder stores this frame locally, designates it as an LTRF, and signals this designation within the stream using a reference frame message.

At 330, a second signal is received indicating a second video stream is selected. The selection signal 155 is sent to an LTRF manager 165 in the video encoder 140 to inform the LTRF manager 165 which video stream has been selected. Again the encoder identifies a particular frame to become an LTRF. The encoder stores this frame locally, designates it as an LTRF, and signals this designation within the stream using a reference frame message. However, in accordance with an embodiment, the new LTRF does not replace the first LTRF in memory. The new LTRF is held in a separate “slot”.

In one example embodiment, the LTRF storage 180 has a fixed number of slots to store LTRFs. The number of slots depends on the coding standard, system design, and other considerations like the number of video feeds the video encoder 140 may have to select from. In the following example the encoder has 16 slots to store LTRFs. Accordingly, the LTRF storage 190 of the decoder 170 (FIG. 1) would also be designed with 16 slots. The 16 LTRF slots may be divided among the number of video streams evenly or according to other system requirements like the amount of motion compensation that must be performed for one stream versus another. If, e.g., the encoder has three video streams to select from (e.g., from cameras 110(1)-110(3)), then each stream will get five slots for a total of 15 with one spare slot, or two streams could be allocated four slots with the remaining stream getting eight slots. In another example, if the encoder has four video streams to select from, then each video stream could be allocated four LTRF slots.
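The slot arithmetic in the preceding example can be reproduced with the following minimal sketch. The function allocate_slots and its integer weights are hypothetical and merely illustrate one possible way to divide a fixed number of slots among video streams; other allocation policies are equally possible.

```python
from typing import List, Optional, Sequence, Tuple


def allocate_slots(total_slots: int,
                   num_streams: int,
                   weights: Optional[Sequence[int]] = None
                   ) -> Tuple[List[int], int]:
    """Divide LTRF slots among streams and report any spare slots.

    With no weights the slots are divided evenly. With weights, each
    stream receives a share proportional to its weight, rounded down.
    """
    if num_streams <= 0:
        raise ValueError("num_streams must be positive")
    if weights is None:
        weights = [1] * num_streams
    if len(weights) != num_streams:
        raise ValueError("one weight per stream is required")
    weight_sum = sum(weights)
    allocation = [total_slots * w // weight_sum for w in weights]
    spare = total_slots - sum(allocation)
    return allocation, spare
```

For 16 slots, three equally weighted streams receive five slots each with one spare, weights of 1, 1, and 2 give four, four, and eight slots, and four equally weighted streams receive four slots each.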

The LTRFs are each stored with some corresponding metadata in the video source to LTRF map 250. The video source to LTRF map 250 and the LTRF storage 180 are managed by the LTRF manager 165. For example, the metadata could comprise a video stream identification (video_stream_ID), a long term picture number (long_term_pic_num), and an LTRF status, each assigned by, e.g., the LTRF manager 165 and stored in the video source to LTRF map 250. It is this metadata that forms the association between a video stream and the LTRFs. Thus, when encoding another selected video stream, the video encoder 140 can immediately use an LTRF that is already stored and associated with the newly selected video stream without resorting to INTRA-prediction. Table 1 indicates one possible data structure for metadata stored in the video source to LTRF map 250 and LTRF storage 180:

TABLE 1
Example data structure for LTRF storage

Metadata in the video source to LTRF map 250           LTRF storage 180
video_stream_ID   long_term_pic_num   LTRF status
2                 1                   ACKed            LTRF data
2                 2                   ACKed            LTRF data
2                 3                   ACKed            LTRF data
2                 4                   ACKed            LTRF data
2                 5                   ACKed            LTRF data
1                 6                   ACKed            LTRF data
1                 7                   ACKed            LTRF data
1                 8                   ACKed            LTRF data
1                 9                   NACKed           LTRF data
1                 10                  created          LTRF data
3                 11                  ACKed            LTRF data
3                 12                  ACKed            LTRF data
3                 13                  ACKed            LTRF data
3                 14                  ACKed            LTRF data
3                 15                  empty            empty
none              none                empty            Spare

The data in Table 1 represent the example given earlier in which the encoder has three video streams to select from and each stream is assigned 5 LTRF slots in memory. The data structure in Table 1 indicates that at one time or another each video stream was selected long enough for a set of LTRF data to be generated for the corresponding video stream. The data structure indicates that video stream 2 may have been the first video stream selected (e.g., from camera 110(2)), followed by video stream 1 (camera 110(1)). However, no particular significance is given to the order or values in Table 1 other than that the long_term_pic_num must be unique for any given LTRF data.

The LTRF manager 165 controls the association of video source (video_stream_ID) to LTRF data, and also maintains the LTRF status of each LTRF slot. An LTRF slot may be empty, it may be created (but not acknowledged), or it may be ACKed or NACKed by the decoder as described below.

If, for example, the video encoder 140 is currently encoding video stream number 2, then, as can be seen from the long_term_pic_num, LTRF status, and LTRF storage columns in Table 1, the video encoder 140 has LTRFs 1-5 to choose from for temporal prediction because each LTRF has data available and each LTRF has been ACKed by the decoder as a good LTRF. If the video stream is switched to, e.g., video stream number 1, then the video encoder 140 immediately has LTRFs 6-8 to choose from for temporal prediction (with LTRF 9 being NACKed by the decoder and LTRF 10 yet to be ACKed/NACKed). The video encoder 140 also has the option of identifying a new LTRF and replacing one of the 5 allocated LTRFs with the new LTRF in the LTRF storage 180. Table 1 would be updated accordingly.
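As one possible illustration of how the metadata of Table 1 could be held and queried, the following sketch models the video source to LTRF map 250 as a list of slot records. The class and function names (LtrfStatus, LtrfSlot, build_table_1, usable_ltrfs) are hypothetical and chosen only for this example; the stored frame data itself is omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LtrfStatus(Enum):
    EMPTY = "empty"
    CREATED = "created"   # stored but not yet acknowledged
    ACKED = "ACKed"
    NACKED = "NACKed"


@dataclass
class LtrfSlot:
    video_stream_id: Optional[int]
    long_term_pic_num: Optional[int]
    status: LtrfStatus


def build_table_1() -> List[LtrfSlot]:
    """Return the 16 slots with the values listed in Table 1."""
    slots = [LtrfSlot(2, n, LtrfStatus.ACKED) for n in range(1, 6)]
    slots += [LtrfSlot(1, n, LtrfStatus.ACKED) for n in (6, 7, 8)]
    slots.append(LtrfSlot(1, 9, LtrfStatus.NACKED))
    slots.append(LtrfSlot(1, 10, LtrfStatus.CREATED))
    slots += [LtrfSlot(3, n, LtrfStatus.ACKED) for n in range(11, 15)]
    slots.append(LtrfSlot(3, 15, LtrfStatus.EMPTY))
    slots.append(LtrfSlot(None, None, LtrfStatus.EMPTY))  # spare slot
    return slots


def usable_ltrfs(slots: List[LtrfSlot], video_stream_id: int) -> List[int]:
    """List long_term_pic_num values that are ACKed for the given stream."""
    return [s.long_term_pic_num for s in slots
            if s.video_stream_id == video_stream_id
            and s.status is LtrfStatus.ACKED]
```

Under these assumptions, a query for video stream 2 yields LTRFs 1 through 5 and a query for video stream 1 yields LTRFs 6 through 8, matching the discussion of Table 1.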

As new LTRFs are identified and stored, the long_term_pic_nums will change or increment. The long_term_pic_nums are sent in the reference frame message by the video encoder 140 to identify to the decoder which LTRF data to reference during the decoding process. The data in Table 1 are replicated at the decoder with the exception of the video_stream_ID, which is not needed by the decoder. Thus the decoder “sees” normally encoded video and the process 300 is transparent to the decoder.

Referring again to FIG. 3, at 340, the second video stream is encoded, forming a continuation of the encoded video stream. At 350, a third signal is received indicating that the first video stream has been re-selected. The third signal informs the encoder 140 that the first video source has been re-selected and the encoder 140 takes a special action. In the case where an LTRF associated with that source has been ACKed, the encoder 140 tries to use that LTRF. At 360, the encoder 140 uses that LTRF as a reference frame for temporal predictive coding. The new frame, to be encoded, is compared to the associated LTRF in the motion estimation block 260. Since the new picture is essentially the same scene as the LTRF, the predictive coding will be successful and the new picture can be encoded very efficiently, achieving high quality without a large expenditure of bits. This is the benefit of this technique. The benefit is most noticeable when the bitrate is constrained per frame, as it usually is in telecommunications systems.
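The decision taken at 350 and 360 can be sketched, under stated assumptions, as a lookup that returns the reference to use for the first frame after a switch. The function choose_reference is hypothetical, the mapping of stream identifiers to ACKed long_term_pic_num values is assumed to be ordered from oldest to newest, and the fall-back to INTRA coding when no ACKed LTRF exists is inferred from the earlier discussion of stream switching rather than stated as a required step.

```python
from typing import Dict, List, Optional, Tuple


def choose_reference(acked: Dict[int, List[int]],
                     stream_id: int) -> Tuple[str, Optional[int]]:
    """Pick the coding mode for the first frame of a newly selected stream.

    acked maps a video_stream_ID to its ACKed long_term_pic_num values.
    Returns ("INTER", pic_num) when an ACKed LTRF is available, otherwise
    ("INTRA", None).
    """
    candidates = acked.get(stream_id, [])
    if candidates:
        # Assumption: the last entry is the most recently stored LTRF.
        return ("INTER", candidates[-1])
    return ("INTRA", None)
```

With the ACKed entries of Table 1, re-selecting video stream 1 would return an INTER decision referencing LTRF 8, whereas a stream with no stored LTRF would be coded without temporal prediction.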

The method may be repeated for any other video stream selected from among the plurality of video streams and encoded by the same encoder that encoded the first video stream. If any video stream already has a stored LTRF associated therewith, then the stored LTRF is used for temporal predictive coding.

In another example embodiment, the video encoder 140 receives error feedback 260 (FIG. 2) from the decoder indicating whether or not the LTRF was correctly decoded and that the reference frame message was correctly received. If either the LTRF was incorrectly decoded or the reference frame message was incorrectly received, then the reference frame message is resent until the error feedback indicates that the reference frame message was correctly received and an identified LTRF was correctly decoded. It should be understood that any new reference frame message used to correct any transmission/decoding error may reference an entirely new LTRF and that the encoder is not required to correct any previously sent LTRF that was received by the decoder with errors.
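The resend behavior described above can be sketched as a simple loop. The function resend_until_acked and the callable send_and_wait are hypothetical; send_and_wait stands for sending one reference frame message (which may identify an entirely new LTRF) and returning True when the decoder's feedback indicates correct receipt and decoding. The max_attempts bound is an assumption added so that the sketch terminates and is not a limit stated in the description.

```python
from typing import Callable, Optional


def resend_until_acked(send_and_wait: Callable[[], bool],
                       max_attempts: int = 5) -> Optional[int]:
    """Resend a reference frame message until it is acknowledged.

    Returns the number of attempts used, or None if the bound is reached
    without a positive acknowledgement.
    """
    for attempt in range(1, max_attempts + 1):
        if send_and_wait():
            return attempt
    return None
```

For example, if the first two messages are reported as damaged and the third is acknowledged, the function returns 3.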

For error feedback, the decoder can send a positive acknowledgement (ACK) indicating that the LTRF was correctly decoded or a negative acknowledgment (NACK) indicating that the LTRF was incorrectly decoded (or that any associated packets were damaged in transit). The encoder and decoder can be designed to use only ACKs, in which case only correctly decoded frames are acknowledged and any LTRFs not ACK'd are considered to contain errors. Alternatively, the encoder and decoder can be designed to use only NACKs, in which case only incorrectly decoded frames are negatively acknowledged and any LTRFs not NACK'd are considered to be error free. In a preferred embodiment, both ACKs and NACKs are employed. It should be understood that the decoder does not have to wait for an entire frame to be decoded before sending ACKs/NACKs, thus reducing latency in error correction and mitigating any latency in the transmission network (like network 160 shown in FIG. 1). The frame can be subdivided into slices (or groups of blocks), in which case the ACKs/NACKs can be sent by the decoder for each slice.
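The three feedback arrangements described above differ only in how the absence of feedback is interpreted, which the following sketch makes explicit. The names FeedbackMode and ltrf_is_usable are hypothetical, and returning None for an LTRF awaiting feedback when both ACKs and NACKs are used (corresponding to the "created" status) is an assumption of this example.

```python
from enum import Enum
from typing import Optional


class FeedbackMode(Enum):
    ACK_ONLY = "ack_only"
    NACK_ONLY = "nack_only"
    BOTH = "both"


def ltrf_is_usable(mode: FeedbackMode,
                   feedback: Optional[str]) -> Optional[bool]:
    """Interpret decoder feedback for one LTRF.

    feedback is "ACK", "NACK", or None when nothing has been received.
    Returns True (good), False (contains errors), or None (undecided).
    """
    if feedback == "ACK":
        return True
    if feedback == "NACK":
        return False
    if mode is FeedbackMode.ACK_ONLY:
        return False   # not ACK'd, so considered to contain errors
    if mode is FeedbackMode.NACK_ONLY:
        return True    # not NACK'd, so considered error free
    return None        # both in use: still awaiting feedback
```

The same interpretation could be applied per slice rather than per frame when the decoder sends feedback for each slice.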

Techniques are provided for managing long-term reference frames (LTRFs) for two or more video streams. Embodiments described herein take advantage of the fact that background scenes do not change appreciably over time for certain applications such as teleconferencing. Thus, LTRFs can be stored for a plurality of video streams even when the video stream is not being encoded and thereby avoid the use of independently coded frames.

Although the apparatus, system, and method are illustrated and described herein as embodied in one or more specific examples, it is nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made therein without departing from the scope of the apparatus, system, and method and within the scope and range of equivalents of the claims. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the apparatus, system, and method, as set forth in the following claims.

1. A method comprising: receiving a first signal from a signal source selection device coupled to a plurality of audio sources each associated with a corresponding video source of a plurality of video streams indicating a first video stream is selected from the plurality of video streams; encoding the first video stream to produce an encoded video stream, wherein a reference frame message identifies a video frame as a long-term reference frame (LTRF) and the LTRF is associated with the first video stream; receiving a second signal from the signal source selection device indicating a second video stream is selected from the plurality of video streams; encoding the second video stream forming a continuation of the encoded video stream; and receiving a third signal from the signal source selection device indicating that the first video stream is re-selected, and on receipt of the third signal, using the LTRF associated with the first video stream as a reference for temporal predictive coding.
2. The method of claim 1, wherein selecting comprises selecting based on predetermined criteria comprising detecting sound, detecting motion, and/or detecting a user input.
3. The method of claim 1, wherein encoding comprises encoding using a coding standard comprising one of International Telecommunication Union (ITU) H.263, Moving Picture Experts Group (MPEG)-4 AVC, Audio Video Standard (AVS), and VC1.
4. The method of claim 1, wherein encoding further comprises identifying additional LTRFs in the reference frame message associated with the first or second video stream.
5. The method of claim 1, wherein the LTRF is a Moving Picture Experts Group (MPEG) 4 AVC LTRF.
6. The method of claim 1, wherein encoding further comprises periodically updating an LTRF associated with the first or second video stream with a new LTRF by way of a reference frame message.
7. The method of claim 1, further comprising receiving error feedback from a decoder indicating whether or not the LTRF identified in the reference frame message was correctly decoded and that the reference frame message was correctly received at the decoder, wherein if either the LTRF was incorrectly decoded or the reference frame message was incorrectly received then resending the reference frame message until the error feedback indicates that the reference frame message was correctly received and the identified LTRF was correctly decoded.
8. An apparatus comprising: a video source selection signal pathway coupled to a plurality of audio sources each associated with a corresponding video source of a plurality of video streams; a switch configured to select a video source from the plurality of video sources according to the video source selection signal; an encoder configured to encode the selected video source to produce an encoded video stream, wherein messages within the encoded video stream identify certain video frames as long-term reference frames (LTRFs); and logic configured to select a particular LTRF for reference in temporal predictive coding according to the video source selection signal.
9. The apparatus of claim 8, wherein the switch is configured to select a video stream based on predetermined criteria comprising detecting sound, detecting motion, and/or detecting a user input.
10. The apparatus of claim 8, wherein the encoder is configured to encode using a coding standard comprising one of ITU H.263, Moving Picture Experts Group (MPEG)-4 AVC, AVS, and VC1.
11. The apparatus of claim 8, wherein the encoder is further configured to identify additional LTRFs associated with the selected video source in the messages.
12. The apparatus of claim 8, wherein the encoder is configured to identify LTRFs comprising Moving Picture Experts Group (MPEG)-4 AVC LTRFs.
13. The apparatus of claim 8, wherein the encoder is configured to periodically update LTRFs associated with the selected video source with new LTRFs, and wherein the new LTRFs are identified in the messages.
14. The apparatus of claim 8, wherein the encoder is further configured to receive error feedback from a decoder indicating whether or not the LTRF identified in the reference frame message was correctly decoded and that the reference frame message was correctly received at the decoder, wherein if either the reference frame message was incorrectly received or the LTRF was incorrectly decoded then resending the reference frame message until the error feedback indicates that the reference frame message was correctly received and the identified LTRF was correctly decoded.
15. Logic encoded in one or more non-transitory media for execution and when executed operable to: select a first video source from a plurality of video sources based on a source selection signal from a signal source selection device coupled to a plurality of audio sources each associated with a corresponding video source of a plurality of video streams; encode the first video source to produce an encoded video stream, wherein a reference frame message identifies a recent video frame as an LTRF associated with the first video source; select a second video source from the plurality of video sources based on the source selection signal; encode the second video source into the same encoded video stream; in response to the video source selection signal, reselect the first video stream, and encode the first frame of the re-selected first video source with a temporally predicted picture which refers to the LTRF associated with the first video source.
16. The logic of claim 15, wherein the logic that selects the first or second video source is operable to select based on predetermined criteria comprising detecting sound, detecting motion, and/or detecting a user input.
17. The logic of claim 15, wherein the logic that encodes is operable to use a coding standard comprising one of ITU H.263, Moving Picture Experts Group (MPEG)-4 AVC, Audio Video Standard (AVS), and VC1.
18. The logic of claim 15, wherein the logic that encodes is further operable to identify additional LTRFs in the reference frame message.
19. The logic of claim 15, wherein the logic that encodes is further operable to update an LTRF associated with the first or second video source from time to time with a new LTRF by way of a reference frame message.
20. The logic of claim 15, further comprising logic to receive error feedback from a decoder indicating whether or not the LTRF was correctly decoded, wherein if the LTRF was incorrectly decoded then sending additional LTRFs until an LTRF is correctly decoded.